Stochastic Sensor Scheduling for Energy Constrained
Estimation in Multi-Hop Wireless Sensor Networks 

Yilin Mo*, Emanuele Garone^, Alessandro Casavola^, Bruno 
Sinopoli* 



Abstract — Wireless Sensor Networks (WSNs) enable a wealth of new 
applications where remote estimation is essential. Individual sensors 
simultaneously sense a dynamic process and transmit measured infor- 
mation over a shared channel to a central fusion center. The fusion 
center computes an estimate of the process state by means of a Kalman 
filter. In this paper we assume that the WSN admits a tree topology
with the fusion center at the root. At each time step only a subset of
sensors can be selected to transmit observations to the fusion center
due to a limited energy budget. We propose a stochastic sensor selection
algorithm that randomly selects a subset of sensors according to a
probability distribution, which is designed to minimize the
asymptotic expected estimation error covariance matrix. We show that
the optimal stochastic sensor selection problem can be relaxed into a 
convex optimization problem and thus solved efficiently. We also provide 
a possible implementation of our algorithm which does not introduce any 
communication overhead. The paper ends with some numerical examples 
that show the effectiveness of the proposed approach. 

Index Terms — Wireless Sensor Networks, Optimization, State Estimation.



I. Introduction 

Sensor networks span a wide range of applications, including 
environmental monitoring and control, health care, home and office 
automation and traffic control [1]. In these applications, estimation
algorithms like Kalman filters can be used to undertake state estimation
tasks based on lumped-parameter models of distributed physical
phenomena. However, WSN operating constraints, such as power
limitations, often make it difficult to collect data from every sensor
at the sampling rates required for effective monitoring. These
considerations have led to the development of sensor scheduling 
strategies able to select, at each time step, the subset of reporting 
sensors that minimizes a certain cost function, usually related to the 
expected estimation error. 

Minimizing the energy consumption of sensor networks and, consequently,
maximizing their lifetime have been active areas of
research over the past few years, as researchers realized that energy
limitations constitute one of the major obstacles to the extensive adoption
of this technology. Energy minimization in sensor networks is
typically accomplished via efficient MAC protocols [2] or via efficient
scheduling of sensor states [3], [4]. In [5], Xue and Ganz showed that
the lifetime of sensor networks is influenced by transmission schemes,
network density and transceiver parameters with different constraints
on network mobility, position awareness and maximum transmission
ranges. Chamam and Pierre [6] proposed a sensor scheduling scheme
capable of optimally putting sensors in active or inactive modes.
Shi et al. [7] considered sensor energy minimization as a means to
maximize the network lifetime while guaranteeing a desired quality
of the estimation accuracy. Moreover, in [8], they proposed a sensor
tree scheduling algorithm which leads to longer network lifetimes.

Conversely, optimizing the performance of sensor networks under
given energy constraints, which can be seen as the dual problem of
network energy minimization, has also been studied by several researchers.
Such a constrained optimization problem has been studied
for continuous-time linear systems in [9] and [10]. In [11], the author
computed the optimal sensor scheduling for the estimation of a Hidden
Markov Model based system. For discrete-time linear systems,
methods like dynamic programming [12] or greedy algorithms [13]
have been proposed to find the optimal sensor scheduling over long
time horizons.

Another important contribution on the topic is the work of Joshi
and Boyd [14], where a general single-step sensor selection problem
was formulated and solved by means of convex relaxation techniques.
That paper provides a very general framework that can handle
various performance criteria and energy and topology constraints.
Following this work, Mo et al. [15], [16], [17] showed that multi-step
sensor selection problems can also be relaxed into convex
optimization problems and thus efficiently solved.

A very different approach with respect to the above deterministic
solutions has been proposed in [18]. There, the authors proposed a
stochastic sensor selection algorithm in networks endowed with star
topology. The algorithm is based on the idea that at each time step the
sensors randomly and autonomously choose whether or not to send measurements
according to a certain probability distribution. Therefore, the
probability distributions become the optimization parameters, which
are chosen to minimize the expected steady-state error covariance
matrix. The authors argued that such a stochastic approach has several
advantages over the conventional approaches: for example, it is easier
to take into account random communication channel failures, which are
quite a common issue in wireless sensor networks. The most relevant
limitation of the results presented in that paper hinges upon the
assumption that only one sensor can transmit its data
at each sampling period, which is a strong assumption and requires
precise coordination between sensors.

In the present work, we go further by proposing a stochastic
sensor selection algorithm that not only overcomes the above limitation
but also solves the routing problem under the assumption that
the wireless sensor network has a tree topology. The proposed approach
may be summarized as follows. The sensors are randomly selected
according to a certain probability distribution that is designed so
as to minimize the expected asymptotic estimation error covariance
matrix while maintaining the connectivity of the network. In order
to make the determination of the above probability distribution
tractable, the problem is relaxed and, instead of the original objective
function, a lower bound to the expected estimation error covariance
matrix is minimized. Such a choice reduces the optimal sensor
scheduling design problem to a convex optimization problem. The
advantages of the stochastic schedule over the deterministic schedule
are threefold:

1) The search space of the stochastic formulation is continuous 
and convex, while the search space of deterministic formulation 
is discrete. Hence, the search of the optimal deterministic 
schedule can be formulated as an integer programming prob- 
lem, which makes the task potentially harder than the stochastic 
counterpart. 

2) The expected performance of the stochastic formulation can 
be better than the deterministic one. Moreover, due to the 
ergodicity of the random Riccati equation, we can prove that 
under mild assumptions almost every sample path of the 
stochastic schedule is better than the deterministic one if the 
system runs long enough. 



3) The stochastic schedule can be implemented with the same 
computation and communication cost as the deterministic one. 

The rest of the paper is organized as follows. In Section II we
describe our system and communication model and introduce the
deterministic and stochastic sensor selection problems. We further
present an ergodicity result on the performance of the stochastic
sensor scheduling method to show that the stochastic formulation could
improve the performance. In Section III we relax the stochastic
sensor selection problem to render it solvable and propose a
possible implementation of our algorithm. Some numerical examples
on the monitoring of a diffusion process are provided in Section IV
and, finally, Section V concludes the paper.

II. Sensor Selection: From Deterministic to Stochastic 
Formulation 

A. System Description 

Consider the following discrete-time LTI system

$$x_{k+1} = A x_k + w_k, \tag{1}$$

where $x_k \in \mathbb{R}^n$ represents the state and $w_k \in \mathbb{R}^n$ the disturbance. It
is assumed that $w_k$ and $x_0$ are independent Gaussian random vectors,
$x_0 \sim \mathcal{N}(0, \Sigma)$ and $w_k \sim \mathcal{N}(0, Q)$, where $\Sigma, Q > 0$ are positive
definite matrices. A wireless sensor network composed of $m$ sensing
devices $s_1, \ldots, s_m$ and one fusion center $s_0$ is used to monitor the
state of system (1). The measurement equation is

$$y_k = C x_k + v_k, \tag{2}$$

where $y_k = [y_{k,1}, y_{k,2}, \ldots, y_{k,m}]' \in \mathbb{R}^m$ is the measurement
vector (footnote 1). Each element $y_{k,i}$ represents the measurement of sensor $i$
at time $k$. $C = [C_1', \ldots, C_m']'$ is the observation matrix and the
matrix pair $(C, A)$ is assumed observable (footnote 2). $v_k \sim \mathcal{N}(0, R)$ is the
measurement noise, assumed to be independent of $x_0$ and $w_k$. We
also assume that the covariance matrix $R = \mathrm{diag}(r_1, \ldots, r_m)$ is
diagonal, which means that the measurement noise at each sensor is
independent of all others, and nonsingular, that is $r_i > 0$, $i = 1, \ldots, m$.

Let us introduce an oriented communication graph $G = \{V, E\}$ in
order to model the communication amongst nodes, where the vertex
set $V = \{s_0, s_1, \ldots, s_m\}$ contains all sensor nodes, including the
fusion center. The set of edges $E \subseteq V \times V$ represents the available
connections, i.e. $(i, j) \in E$ implies that the node $s_i$ may send
information to the node $s_j$. Moreover, it is assumed that each node of
the sensor network acts as a gateway for a specific number of other
nodes, which means that every time it communicates with another
node it sends, in a single packet, its own measurements collected
together with all data received from the other nodes.

We always assume that, for every sensor in the network, there
exists one and only one communication path to the fusion center,
i.e. the sensor network has a directed tree topology. Moreover, we
assume that each link has an associated weight $c(e_{ij})$ which indicates
the energy consumed when $s_i$ directly transmits a packet to $s_j$. For
the sake of legibility, we sometimes abbreviate $c(e_{ij})$ as $c_i$, $i = 1, \ldots, m$,
because, in the assumed topology, each sensor node has
only one outgoing edge.

Remark 1. The tree topology assumption may be restrictive
in the general case, where one sensor can usually communicate
with several nearby nodes. However, it is worth remarking
that typical communication network graphs can be approximated
by a collection of "representative" spanning trees (e.g. the first m
spanning trees of the spanning tree enumeration [19]).

1 The ' on a matrix always means transpose.

2 The assumption of observability is without loss of generality since we
could perform Kalman decomposition and only consider the observable space
even if the system is not observable.

B. Stochastic vs. Deterministic Sensor Selection

Because sensor measurements usually contain redundant informa- 
tion, in order to reduce the energy consumption it would be highly 
desirable to use a minimal subset of sensors at each sampling time. 
However, in a tree topology, we cannot select arbitrary subsets of 
nodes but we are forced to select nodes (and connections) such that, 
for each selected node, there exists a communication path to the 
fusion node. As a result, any possible transmission topology of G is 
a subtree $T = \{V_T, E_T\}$, with $s_0 \in V_T$, $V_T \subseteq V$ and $E_T \subseteq E$.
Hereafter, $V_T$ denotes the selected subset of sensors and $E_T$ the
communication paths used by the sensors to transmit observations
to the fusion center. We also denote by $\mathcal{T}$ the set of all possible
transmission topologies $T$ (i.e. the set of all possible subtrees of $G$
containing $s_0$).

It is straightforward to show that, for a transmission tree $T$, the
total transmission energy consumption is given by (footnote 3)

$$\mathcal{E}(T) = \sum_{e \in E_T} c(e).$$



Suppose that at each time $k$ we randomly select a tree $T$ from
$\mathcal{T}$ and each sensor in $T$ transmits its observation back to the fusion
node according to the topology $T$. Let $\pi_{k,T}$ be the probability that
the transmission tree $T$ is selected at time $k$. Then, we may define

$$p_{k,i} = \sum_{T \in \mathcal{T},\, s_i \in V_T} \pi_{k,T} \tag{3}$$

as the marginal probability that sensor $i$ is selected at time $k$. Further,
let us define $p_k = [p_{k,1}, \ldots, p_{k,m}]'$ and $\pi_k = [\pi_{k,T_1}, \ldots, \pi_{k,T_{|\mathcal{T}|}}]'$
to be the vectors of all $p_{k,i}$ and $\pi_{k,T}$ respectively. We can
introduce the binary random variable $\delta_{k,T}$ such that $\delta_{k,T} = 1$ if
the transmission tree $T$ is selected at time $k$ and $\delta_{k,T} = 0$ otherwise.
Similarly, let us also define the binary random variable $\gamma_{k,i}$ to be 1
if sensor $i$ is selected at time $k$ and 0 otherwise. It is well known
that the Kalman filter is still the optimal filter [18]. Suppose that
$V_T = \{s_0, s_{i_1}, \ldots, s_{i_j}\}$, then we can define

$$C_T \triangleq [C_{i_1}', C_{i_2}', \ldots, C_{i_j}']', \qquad R_T \triangleq \mathrm{diag}(r_{i_1}, \ldots, r_{i_j}). \tag{4}$$

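It can be proved that the estimation error covariance $P_k$ and the
information matrix $Z_k$ (footnote 4) of the Kalman filter satisfy the following
recursive equation:

$$Z_k = P_k^{-1} = P_{k|k-1}^{-1} + C_T' R_T^{-1} C_T, \tag{5}$$

where $P_{k|k-1} = A P_{k-1} A' + Q$ and $T$ is the transmission tree selected at time $k$.
Let us define $g_{\pi_k,k}$ as a random operator such that

$$g_{\pi_k,k}(X) \triangleq \sum_{T \in \mathcal{T}} \delta_{k,T}\, g_T(X), \tag{6}$$

where $P(\delta_{k,T} = 1) = \pi_{k,T}$, and

$$g_T(X) \triangleq \left[ (A X A' + Q)^{-1} + \sum_{s_i \in V_T,\, s_i \neq s_0} \frac{C_i' C_i}{r_i} \right]^{-1}. \tag{7}$$

We have

$$P_k = g_{\pi_k,k}(P_{k-1}). \tag{8}$$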
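As an illustration of the recursion (6)-(8), the following minimal Python sketch applies the map $g_T$ for a chosen sensor subset and draws one random step. It is only a sketch under stated assumptions: the function names `g_T` and `random_step`, the representation of a tree as a list of sensor indices, and the use of NumPy are all illustrative choices, not part of the paper.

```python
import numpy as np

def g_T(X, A, Q, C_rows, r, selected):
    """Covariance map (7) for the sensors whose indices are in `selected`.

    C_rows[i] is the row C_i and r[i] the noise variance r_i (assumed layout).
    """
    info = np.linalg.inv(A @ X @ A.T + Q)          # prior information matrix
    for i in selected:
        c = np.asarray(C_rows[i], dtype=float).reshape(1, -1)
        info = info + (c.T @ c) / r[i]             # add C_i' C_i / r_i
    return np.linalg.inv(info)

def random_step(P_prev, A, Q, C_rows, r, trees, pi, rng):
    """One draw of (8): select tree T with probability pi[T], then apply g_T."""
    idx = rng.choice(len(trees), p=pi)
    return g_T(P_prev, A, Q, C_rows, r, trees[idx])
```

For a scalar system with $A = Q = C_1 = r_1 = 1$ and $X = 1$, the map returns 2 when no sensor transmits and 2/3 when sensor 1 transmits, which matches (7) computed by hand.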



3 Here we assume that $c(e_{ij})$ is constant regardless of the number of
observations contained in the packet. This is realistic in most cases,
especially when measurements are of a simple type, such as low precision scalar
values, and the transmission overhead, e.g. header, handshaking protocol,
dominates the payload.

4 The information matrix is the inverse of estimation error covariance 



In this paper we are more interested in a time-invariant schedule $\pi$.
Hence, let us define

$$g_\pi^\infty(X) = \lim_{k \to \infty} E\,(g_{\pi,k} \circ g_{\pi,k-1} \circ \cdots \circ g_{\pi,1})(X), \tag{9}$$

when the limit exists. Otherwise, $g_\pi^\infty(X)$ is infinity. Note that $g_\pi^\infty$
is a deterministic function, which indicates the limit performance of
stochastic sensor selection when the fixed schedule $\pi$ is used. It is
easy to see that

$$\lim_{k \to \infty} E P_k = g_\pi^\infty(\Sigma),$$

when the fixed schedule $\pi$ is used and $g_\pi^\infty(\Sigma) < \infty$.

Since transmission trees are randomly selected, $P_k$ is a random
matrix. Thus, we only minimize the asymptotic expected estimation
error covariance matrix while requiring that the expected energy
consumption does not exceed a designated threshold $\mathcal{E}_d$. The problem
of finding the optimal fixed stochastic schedule that minimizes
the expected asymptotic estimation error covariance matrix can be
formulated as

Problem 1 (Fixed Random Schedule that Optimizes Expected
Asymptotic Performance).

$$\begin{aligned}
&\underset{\pi}{\text{minimize}} && \mathrm{trace}(g_\pi^\infty(\Sigma)) \\
&\text{subject to} && \sum_{T \in \mathcal{T}} \pi_T\, \mathcal{E}(T) \le \mathcal{E}_d, \quad \pi_T \ge 0, \quad \sum_{T \in \mathcal{T}} \pi_T = 1.
\end{aligned}$$



Since the deterministic schedule can be seen as a special case of
the stochastic schedule, where the $\pi_{k,T}$ are forced to be either 0 or 1,
the problem of finding the optimal fixed deterministic schedule that
minimizes the asymptotic estimation error covariance matrix can be
formulated as

Problem 2 (Fixed Deterministic Schedule that Optimizes Asymptotic
Performance).

$$\begin{aligned}
&\underset{\pi}{\text{minimize}} && \mathrm{trace}(g_\pi^\infty(\Sigma)) \\
&\text{subject to} && \sum_{T \in \mathcal{T}} \pi_T\, \mathcal{E}(T) \le \mathcal{E}_d, \quad \pi_T = 0 \text{ or } 1, \quad \sum_{T \in \mathcal{T}} \pi_T = 1.
\end{aligned}$$

Remark 2. In Problem 1 we require that the expected energy
consumption does not exceed a certain energy budget. In real applications
different constraints may be considered (e.g. requirements
on the sensor lifetime). However, it can be shown (see e.g. [14]) that
many of these constraints can be easily integrated into the above
framework.

Remark 3. It is worth noticing that at each sampling time, the energy
cost of the deterministic schedule cannot exceed the designated threshold
$\mathcal{E}_d$. This is important in order to understand why
stochastic sensor selection, being allowed to use more energy in
a single sampling period, can achieve better performance than the
above deterministic formulation.

It is also worth noticing that a periodic schedule can also be
formulated as Problem 2 by enlarging the state space. As a result, all
the results in this section can be generalized to periodic schedules.
However, in Section III we focus only on time-invariant schedules.

Remark 4. Another main difference between Problem 1 and Problem 2
is that the search space of the deterministic schedule is discrete,
while that of the stochastic schedule is continuous and convex. This
brings several advantages. First, the deterministic schedule can be
seen as a particular kind of random schedule, where the $\pi_{k,T}$ are binary.
As a result, stochastic sensor selection strategies could possibly
improve the sensor selection performance (at least in the expected
sense). The second advantage is that the feasible set of $\pi_{k,T}$ is convex,
which allows us to further manipulate the problem into a convex
form.

As commented above, the expected performance of the optimal
stochastic schedule is at least as good as that of its deterministic counterpart.
Let $\pi^*$ be the optimal stochastic schedule and $\pi_d^*$ the optimal
deterministic schedule. We have

$$\lim_{k \to \infty} E\,\mathrm{trace}(P_k(\pi^*)) \le \lim_{k \to \infty} \mathrm{trace}(P_k(\pi_d^*)),$$

which implies that

$$\lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} E\,(\mathrm{trace}\,P_k(\pi^*)) \le \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} \mathrm{trace}\,P_k(\pi_d^*).$$

To strengthen this result, the following theorem states that if the
optimal stochastic schedule is allowed to run for a long time, then
almost every sample path of the stochastic schedule is potentially
better than the deterministic one in the average sense.

Theorem 1. Suppose that the fixed schedule $\pi^*$ is the solution
of Problem 1. If the linear system and $\pi^*$ satisfy the following
assumptions:

1) $A$ is invertible, $(A, Q^{1/2})$ is controllable;

2) there exists a transmission topology $T$ with $\pi_T^* > 0$ such that
$(C_T, A)$ is observable

and the stochastic process $\{P_k\}$ satisfies $P_k = g_{\pi^*,k}(P_{k-1})$, $P_0 = \Sigma$,
then almost surely the following inequality holds

$$\lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} \mathrm{trace}(P_k) \le \mathrm{trace}(g_{\pi^*}^\infty(\Sigma)). \tag{10}$$

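Proof: It is easy to check that all the assumptions of Theorem
3.4 of [20] hold. As a result, there exists an ergodic stationary process
$\{\bar{P}_k\}$ which satisfies $\bar{P}_k = g_{\pi^*,k}(\bar{P}_{k-1})$. Moreover,

$$\lim_{k \to \infty} \|P_k - \bar{P}_k\| = 0 \quad \text{a.s.}$$

We want to prove that $E(\mathrm{trace}(\bar{P}_0))$ is less than or equal to
$\mathrm{trace}(g_{\pi^*}^\infty(\Sigma))$ and hence is finite. Because $\bar{P}_k$ is ergodic, and $P_k$
converges to $\bar{P}_k$ almost surely, we know that

$$\lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} \min(\mathrm{trace}(P_k), M) = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} \min(\mathrm{trace}(\bar{P}_k), M) = E[\min(\mathrm{trace}(\bar{P}_0), M)] \quad \text{a.s.},$$

where $M > 0$ is a constant. By the definition of $g_{\pi^*}^\infty$, we know that

$$\begin{aligned}
\mathrm{trace}(g_{\pi^*}^\infty(\Sigma)) &\ge \lim_{N \to \infty} E\left[ \frac{1}{N} \sum_{k=1}^{N} \min(\mathrm{trace}(P_k), M) \right] \\
&= E\left[ \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} \min(\mathrm{trace}(P_k), M) \right] \\
&= E[\min(\mathrm{trace}(\bar{P}_0), M)].
\end{aligned}$$

The second equality follows from the Dominated Convergence Theorem.
Now, let $M \to \infty$. By the Monotone Convergence Theorem it
results that

$$E[\mathrm{trace}(\bar{P}_0)] = \lim_{M \to \infty} E[\min(\mathrm{trace}(\bar{P}_0), M)] \le \mathrm{trace}(g_{\pi^*}^\infty(\Sigma)),$$

which proves that $E[\mathrm{trace}(\bar{P}_0)] \le \mathrm{trace}(g_{\pi^*}^\infty(\Sigma))$. Hence, by
ergodicity, we obtain

$$\lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} \mathrm{trace}(P_k) = \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} \mathrm{trace}(\bar{P}_k) = E(\mathrm{trace}(\bar{P}_0)) \le \mathrm{trace}(g_{\pi^*}^\infty(\Sigma)) \quad \text{a.s.}$$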

Remark 5. Combining Remark 4 with the results of Theorem 1 we
can conclude that the average performance of almost every sample
path of the optimal stochastic schedule is at least as good as that of its deterministic
counterpart.

Before moving forward, it is worth pointing out that Problem 1
is still numerically intractable. In fact:

1) it is usually difficult to express $E P_\infty$ as an explicit function of
$\pi_{T_1}, \ldots, \pi_{T_{|\mathcal{T}|}}$;

2) since $|\mathcal{T}|$ is large, the number of optimization variables and
constraints may not be polynomial with respect to the number
of nodes.

In the next section, we will devise a possible relaxation method that
allows one to overcome the above two problems.

III. Relaxation and Implementation 

In this section, we first relax Problem 1 to a convex
problem. We then propose a possible implementation of our stochastic
schedule without introducing communication and computation overhead.

A. Relaxation

In this subsection we consider a convex relaxation of Problem 1.
To this end, let us define a lower bound $L_k$ to $E P_k$ by means of the
following theorem, whose proof is reported in the Appendix.

Theorem 2. Let $L_0 = P_0$ and

$$L_k = \left[ L_{k|k-1}^{-1} + \sum_{i=1}^{m} p_{k,i} \frac{C_i' C_i}{r_i} \right]^{-1}, \tag{11}$$

where $L_{k|k-1} = A L_{k-1} A' + Q$. The following inequality holds:

$$E P_k \ge L_k. \tag{12}$$

To further improve the legibility, let us define the function

$$L(X, p) = \left[ (A X A' + Q)^{-1} + \sum_{i=1}^{m} p_i \frac{C_i' C_i}{r_i} \right]^{-1}, \tag{13}$$

where $X \in \mathbb{R}^{n \times n}$ is positive semidefinite and $p = [p_1, \ldots, p_m]' \in \mathbb{R}^m$.
Moreover, let us define

$$L^{(1)}(X, p) = L(X, p), \qquad L^{(k)}(X, p) = L(L^{(k-1)}(X, p), p), \tag{14}$$

with

$$L^\infty(X, p) = \lim_{k \to \infty} L^{(k)}(X, p), \tag{15}$$

when the limit exists. Hence (11) can be simplified as

$$L_k = L(L_{k-1}, p_k). \tag{16}$$

By replacing the objective function in Problem 1 with its lower bound,
we obtain the following:

Problem 3 (Asymptotic Lower Bound for Random Transmission Tree
Selection).

$$\begin{aligned}
&\underset{\pi_T,\, p}{\text{minimize}} && \mathrm{trace}(L^\infty(\Sigma, p)) \\
&\text{subject to} && \sum_{T \in \mathcal{T}} \pi_T\, \mathcal{E}(T) \le \mathcal{E}_d, \\
& && \pi_T \ge 0, \quad \sum_{T \in \mathcal{T}} \pi_T = 1, \quad p_i = \sum_{s_i \in V_T} \pi_T.
\end{aligned}$$

5 The readers can refer to [21] for more information.

There are drawbacks of the above formulation: 1) the optimization
problem still has a number of constraints and variables depending
on $|\mathcal{T}|$, a number which is not, in the general case, polynomial
with respect to $m$; 2) $L^\infty$ is still not explicit. Let us first drop
the dependence on $\pi_T$. To this end, define the set of feasible $p$ for
Problem 3:

$$\mathcal{P} \triangleq \left\{ p : \exists\, \pi, \ \sum_{T \in \mathcal{T}} \pi_T\, \mathcal{E}(T) \le \mathcal{E}_d, \ \pi_T \ge 0, \ \sum_{T \in \mathcal{T}} \pi_T = 1, \ p_i = \sum_{s_i \in V_T} \pi_T \right\}.$$

The following results can be easily proved:

Proposition 1. The energy cost of a given collection of tree selection
probabilities $\pi_{k,T}$, $\forall T \in \mathcal{T}$, is a linear function of the resulting
marginal probabilities:

$$\sum_{T \in \mathcal{T}} \pi_{k,T}\, \mathcal{E}(T) = \sum_{i=1}^{m} c_i\, p_{k,i}. \tag{17}$$

Proposition 2. If $p_i \in [0, 1]$ and if it satisfies

$$p_i \le p_j, \quad \text{if } j \text{ is a parent of } i, \tag{18}$$

then there exists at least one collection of tree selection probabilities
$\pi$ such that

$$\pi_T \ge 0, \quad \sum_{T \in \mathcal{T}} \pi_T = 1, \quad p_i = \sum_{s_i \in V_T} \pi_T. \tag{19}$$

Conversely, if there exists $\pi_k$ such that (19) holds, then $p_{k,i} \in [0, 1]$
and satisfies (18).

By exploiting the above Propositions we can reformulate the feasible
set $\mathcal{P}$ as follows

$$\mathcal{P} = \left\{ p : p_i \in [0, 1], \ \sum_{i=1}^{m} c_i p_i \le \mathcal{E}_d, \ p_i \le p_j \text{ if } j \text{ is a parent of } i \right\} \tag{20}$$

and we can rewrite Problem 3 as

Problem 4 (Asymptotic Lower Bound for Random Transmission Tree
Selection).

$$\begin{aligned}
&\underset{p \in \mathbb{R}^m}{\text{minimize}} && \mathrm{trace}(L^\infty(\Sigma, p)) \\
&\text{subject to} && p \in \mathcal{P}.
\end{aligned}$$
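To make the objects in Problem 4 concrete, here is a minimal Python sketch of the map $L(X, p)$ in (13), a fixed-point iteration approximating $L^\infty(X, p)$ in (15), and a membership test for the set in (20). It is an illustration under assumptions: the function names, the `parent` list encoding of the tree (with `None` meaning the parent is the fusion center), and the stopping tolerance are hypothetical choices that do not come from the paper.

```python
import numpy as np

def L_map(X, p, A, Q, C_rows, r):
    """The function L(X, p) of (13)."""
    info = np.linalg.inv(A @ X @ A.T + Q)
    for p_i, c_i, r_i in zip(p, C_rows, r):
        c = np.asarray(c_i, dtype=float).reshape(1, -1)
        info = info + p_i * (c.T @ c) / r_i        # p_i * C_i' C_i / r_i
    return np.linalg.inv(info)

def L_inf(X0, p, A, Q, C_rows, r, tol=1e-10, max_iter=10_000):
    """Iterate (14) until successive iterates agree; approximates (15)."""
    X = X0
    for _ in range(max_iter):
        X_next = L_map(X, p, A, Q, C_rows, r)
        if np.linalg.norm(X_next - X) < tol:
            return X_next
        X = X_next
    raise RuntimeError("iteration did not converge; the limit may not exist")

def in_feasible_set(p, c, parent, E_d, eps=1e-12):
    """Check the three conditions of (20) for a candidate vector p."""
    p = np.asarray(p, dtype=float)
    if np.any(p < -eps) or np.any(p > 1 + eps):
        return False                               # p_i must lie in [0, 1]
    if float(np.dot(c, p)) > E_d + eps:
        return False                               # energy budget
    return all(parent[i] is None or p[i] <= p[parent[i]] + eps
               for i in range(len(p)))             # p_child <= p_parent
```

For the scalar case $A = Q = C_1 = r_1 = 1$ with $p_1 = 1$, the fixed point solves $X^2 + X - 1 = 0$, so the iteration should approach $(\sqrt{5} - 1)/2$.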



Now the main difficulty in solving the above problem is that
$L^\infty(X, p)$ is in general not convex in $p$. Moreover, the exact form of
$L^\infty(X, p)$ is unknown. To overcome those limitations, we propose
the following algorithm:

1) Define $p_0 = \left( \mathcal{E}_d / \sum_{i=1}^{m} c_i \right) 1_m$, where $1_m \in \mathbb{R}^m$ is a vector
with all entries equal to one, and choose the matrix $L_0 = L^\infty(I_n, p_0)$.

2) Let $L_k$ and $p_k$ be the solution of the following optimization
problem

Problem 5 (Random Sensor Selection with Descent Constraint).

$$\begin{aligned}
&\underset{p_k \in \mathbb{R}^m}{\text{minimize}} && \mathrm{trace}(L_k) \ \big(= \mathrm{trace}(L(L_{k-1}, p_k))\big) \\
&\text{subject to} && L_k \le L_{k-1}, \quad p_k \in \mathcal{P}.
\end{aligned}$$

3) Choose $p^*$ as an accumulation point (footnote 6) of $p_k$. Then
$L^\infty(X, p^*) = \lim_{k \to \infty} L_k$ for any $X \ge 0$.
Before proving the feasibility of the above algorithm, we want to 
point out that our algorithm is greedy. In fact, we try to minimize 

6 An accumulation point of a sequence is the limit of a converging 
subsequence 



the lower bound for the next step in the hope of reducing the final 
asymptotic lower bound. As a result, it is suboptimal by nature. The 
following theorem gives a characterization of the main features of 
the proposed algorithm. 

Theorem 3. L(X, p) is convex with respect to p and it is concave 
and monotonically increasing with respect to X. 

Due to the convexity of $L$ and $\mathcal{P}$, Problem 5 is a convex optimization
problem with $O(m)$ optimization variables and $O(m)$ constraints.
Thus, it can be solved efficiently. For example, if interior-point methods
are used, then the complexity is $O(m^3)$. For detailed discussions
about the computational burden, please refer to [14].

Theorem 4. The following statements are true for the proposed 
algorithm: 

1) $L_0$ exists.

2) Problem 5 is always feasible.

3) $p^*$ exists and $p^* \in \mathcal{P}$.

4) $L_\infty = \lim_{k \to \infty} L_k$ exists.

5) $L_\infty = L^\infty(X, p^*)$ for all positive semidefinite $X$.

Proof: 

1) The proof is reported in the Appendix. 

2) Suppose that Problem 5 is feasible up to time $k$. To prove
that the problem is also feasible at time $k + 1$, we only need to
find one $p \in \mathcal{P}$ such that $L(L_k, p) \le L_k$. If we choose $p = p_k$
then, because $p_k$ is the solution at time $k$, it follows that
$p_k \in \mathcal{P}$. It remains to prove that $L(L_k, p_k) \le L_k$, which
can be proved by noticing that $L_k = L(L_{k-1}, p_k) \le L_{k-1}$
and $L(X, p)$ is monotonically increasing with respect to $X$.
Similarly, Problem 5 is also feasible at time 1 and then, by
induction, Problem 5 is always feasible.

3) It is easy to see that $p_k$ is bounded because $p_{k,i} \in [0, 1]$.
By means of the Bolzano-Weierstrass Theorem, this implies
that there always exists an accumulation point $p^*$. Moreover,
because $p_k \in \mathcal{P}$ and $\mathcal{P}$ is closed, $p^* \in \mathcal{P}$.

4) Because $\{L_k\}$ is decreasing and $L_k \ge 0$ for all $k$, the limit
must exist.

5) The proof is reported in the Appendix. 



Remark 6. It is worth noticing that in general there may exist more
than one set of $\pi_T$, $\forall T \in \mathcal{T}$, with the same marginal probabilities.
One possible way to determine $\pi_T$ is as follows:

1) Sort the marginal probabilities $p_i$; suppose that $p_{i_1} \ge p_{i_2} \ge \cdots \ge p_{i_m}$.

2) Define $T_0 = \{s_0\}$, $T_j = T_{j-1} \cup \{s_{i_j}\}$.

3) Choose $\pi_{T_0} = 1 - p_{i_1}$, $\pi_{T_1} = p_{i_1} - p_{i_2}$, $\pi_{T_2} = p_{i_2} - p_{i_3}$, $\ldots$, $\pi_{T_m} = p_{i_m}$.

One can easily verify that $T_i \in \mathcal{T}$ and that the $\pi_T$ are compatible with the
marginal probabilities.
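The construction in Remark 6 can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and the representation of a tree as a `frozenset` of sensor indices (the fusion center $s_0$ being implicit) are assumptions made here. The helper `expected_energy` assumes, as in the paper's tree model, that the energy of a tree is the sum of the $c_i$ of its sensors, and can be used to check the linear relation of Proposition 1 numerically.

```python
def trees_from_marginals(p):
    """Nested trees T_0, ..., T_m with probabilities, following Remark 6.

    Returns a list of (frozenset of sensor indices, probability) pairs.
    """
    order = sorted(range(len(p)), key=lambda i: -p[i])   # p_{i_1} >= p_{i_2} >= ...
    ps = [p[i] for i in order] + [0.0]                   # pad with 0 for T_m
    schedule = [(frozenset(), 1.0 - ps[0])]              # T_0 = {s_0}
    for j in range(1, len(p) + 1):
        schedule.append((frozenset(order[:j]), ps[j - 1] - ps[j]))
    return schedule

def marginals(schedule, m):
    """Recover p_i as the total probability of the trees containing sensor i."""
    return [sum(prob for nodes, prob in schedule if i in nodes) for i in range(m)]

def expected_energy(schedule, c):
    """Expected energy, assuming E(T) is the sum of c_i over sensors in T."""
    return sum(prob * sum(c[i] for i in nodes) for nodes, prob in schedule)
```

For $p = [0.5, 0.8, 0.2]$ the sketch produces four nested sets with probabilities 0.2, 0.3, 0.3 and 0.2, whose marginals reproduce $p$. When two sensors have equal probabilities, the tie-breaking order may produce an intermediate set that is not a valid subtree, but that set receives probability zero.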

B. Implementation 

In this subsection we discuss a possible implementation of our 
sensor selection algorithm. We assume that a fixed random schedule 
$p$ is used. Since the optimization does not depend on the real-time
sensor measurement $y_k$, the optimization step is performed off-line
in a centralized fashion. Each sensor $i$ stores its optimal $p_i$ and the $p_j$
of all its children.

At each time k, we have to select one subset of sensors according 
to the marginal probabilities p. However, we do not want the 
fusion center to query the nodes because this would increase the 



communication overhead, defying the purpose of sensor selection. 
To overcome this problem, we propose the following algorithm: 

1) Every sensor is equipped with the same random number gen- 
erator and the same seed. 

2) At time $k$, each sensor draws a random number $\alpha_k$ from the
random number generator.

3) If sensor $i$ has no children, then it compares $\alpha_k$ with $p_i$. If $\alpha_k < p_i$,
then it transmits the measurement to its parent. Otherwise,
it does not transmit anything.

4) If sensor $i$ has children, then it compares $\alpha_k$ with $p_j$, where $j$
is the index of its child node. If $\alpha_k < p_j$, then sensor $i$ knows
that child $j$ will forward an observation packet to it. After
node $i$ receives all the observation packets from its children, it
merges all packets and its own observations into a single packet
and forwards it to its parent. If $\alpha_k \ge p_j$ for every child $j$ of $i$,
then node $i$ compares $\alpha_k$ with $p_i$. If $\alpha_k < p_i$, then sensor
$i$ transmits its measurements to its parent. Otherwise, it does
not transmit anything.

Because all sensors are equipped with the same random number
generator and the same seed, every sensor gets the same $\alpha_k$ at
time $k$. Hence, the above algorithm guarantees that all sensors agree
on the same transmission topology $T$ which satisfies the marginal
distribution $p$. It is worth remarking that in such a scheme the only
communication needed is the transmission of the observation packets
and no communication overhead for coordination purposes is needed.
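The shared-seed rule described in steps 1) to 4) can be simulated with a short Python sketch. Everything here is an assumption made for illustration: the class name `Sensor`, the helper names, the use of Python's `random.Random` as the common generator, and the implicit choice that $\alpha_k$ is uniform on $[0, 1)$. It is not the authors' implementation.

```python
import random

class Sensor:
    """One node running the shared-seed rule (illustrative names only)."""

    def __init__(self, index, p_own, child_probs, seed):
        self.index = index
        self.p_own = p_own                # marginal probability p_i
        self.child_probs = child_probs    # {child index j: p_j}
        self.rng = random.Random(seed)    # same generator and seed on every node

    def step(self):
        alpha = self.rng.random()         # identical on all sensors at time k
        # Children from which a packet should be expected in this round.
        expected = sorted(j for j, p_j in self.child_probs.items() if alpha < p_j)
        transmit = bool(expected) or alpha < self.p_own
        return transmit, expected

def run_round(sensors):
    """Every sensor decides locally; returns {index: (transmit, expected)}."""
    return {s.index: s.step() for s in sensors}

def simulate(sensors, steps):
    """Run several rounds and collect the local decisions."""
    return [run_round(sensors) for _ in range(steps)]
```

With sensor 0 as parent of sensors 1 and 2 and $p = [0.9, 0.5, 0.3]$, each round the parent's list of expected children coincides with the children that actually transmit, without any coordination message being exchanged.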

Remark 7. It is worth mentioning that since all the sensors agree on
the same $\alpha_k$, it is very easy to implement a Time Division Multiple
Access (TDMA) protocol to avoid wireless interference.

IV. Simulation Result 

In order to show the effectiveness of the proposed method we apply 
our stochastic sensor selection algorithm to a numerical example in 
which a sensor network is deployed to monitor a diffusion process 
in an $l \times l$ planar closed region, whose model is given by

$$\frac{\partial u}{\partial t} = \alpha \nabla^2 u, \tag{21}$$

where $\nabla^2$ is the Laplace operator, $u(t, x_1, x_2)$ denotes the temperature
at time $t$ at location $(x_1, x_2)$ and $\alpha$ indicates the speed of the
diffusion process.

We use the finite difference method to discretize this model by
dividing the region into 1 m × 1 m grids and time into 1 s slots. If
we group all temperature values at time $k$ in the vector $U_k =
[u(k, 0, 0), \ldots, u(k, 0, N-1), u(k, 1, 0), \ldots, u(k, N-1, N-1)]'$,
we can write the evolution of the discretized system as $U_{k+1} = A U_k$,
where the matrix $A$ can be computed from the discretization. If we introduce
process noise, $U_k$ will evolve according to $U_{k+1} = A U_k + W_k$,
where $W_k \sim \mathcal{N}(0, Q)$ is the process noise.

We suppose that the fusion center is located in the bottom left
corner at position (0, 0). We assume that $m$ sensors are randomly
distributed in the region and each sensor measures a linear combination
of the temperatures of the grid points around it (footnote 7). In particular, if we suppose
that sensor $l$, with coordinates $(a_1, a_2)$, is in the cell $[i, j]$, i.e.
$a_1 \in [i, i+1)$ and $a_2 \in [j, j+1)$, the measurement of this sensor is

$$y_{k,l} = \big[ (1 - \Delta a_1)(1 - \Delta a_2)\, u(k, i, j) + \Delta a_1 (1 - \Delta a_2)\, u(k, i+1, j) + (1 - \Delta a_1) \Delta a_2\, u(k, i, j+1) + \Delta a_1 \Delta a_2\, u(k, i+1, j+1) \big] / h^2 + v_{k,l},$$

where $\Delta a_1 = a_1 - i$, $\Delta a_2 = a_2 - j$ and $v_{k,l}$ is the measurement
noise of sensor $l$ at time $k$. Indicating with $Y_k$ the vector of all the
measurements at time $k$, it follows that $Y_k = C U_k + V_k$, where $V_k$
denotes the measurement noise at time $k$, assumed to have normal
distribution $\mathcal{N}(0, R)$, and $C$ is the observation matrix. Finally, we
assume that the sensor network admits a minimum spanning tree
topology in which the communication cost from sensor $i$ to $j$ is

$$\mathrm{cost}(e_{ij}) = c + \ldots$$

where $d_{ij}$ is the Euclidean distance from sensor $i$ to sensor $j$ and
$c$ is a constant related to the sensing energy consumption (footnote 8). For the
simulations, we impose the following parameters: $l = 3$ m, $m = 16$,
$\alpha = 0.1$ m²/s, $Q = R = I \in \mathbb{R}^{16 \times 16}$, $\Sigma = 4I \in \mathbb{R}^{16 \times 16}$,
$\mathcal{E}_d = 6$, $c = 1$.

7 We do not require the sensors to be placed at grid points
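The bilinear measurement model of the sensors can be illustrated by building one row of the observation matrix $C$. The sketch below is hedged: the function name `observation_row`, the row-major index $iN + j$ (chosen to match the ordering of $U_k$ given in the text), the clamping of sensors lying exactly on the far boundary, and the default $h = 1$ (taken here to be the grid spacing, which the text does not define) are all assumptions.

```python
import numpy as np

def observation_row(a1, a2, N, h=1.0):
    """Row of C for a sensor at (a1, a2), in grid units, on an N x N grid."""
    # Cell indices; clamp so that i + 1 and j + 1 stay inside the grid.
    i = min(int(np.floor(a1)), N - 2)
    j = min(int(np.floor(a2)), N - 2)
    da1, da2 = a1 - i, a2 - j
    row = np.zeros(N * N)

    def idx(ii, jj):
        return ii * N + jj                # position of u(k, ii, jj) in U_k

    row[idx(i, j)] = (1 - da1) * (1 - da2)
    row[idx(i + 1, j)] = da1 * (1 - da2)
    row[idx(i, j + 1)] = (1 - da1) * da2
    row[idx(i + 1, j + 1)] = da1 * da2
    return row / h ** 2
```

For a 4 × 4 grid and a sensor at (1.25, 2.5), the four nonzero weights are 0.375, 0.125, 0.375 and 0.125, and they sum to one when $h = 1$.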

We compare the performance of the optimal fixed stochastic sched- 
ule with optimal fixed deterministic schedule found by exhaustive 
search. Figure 1 shows the histogram of the ratio between $\mathrm{trace}(P_\infty)$
of the deterministic schedule and $\mathrm{trace}(E P_\infty)$ of the stochastic schedule,
which is generated by 100 random experiments. The blue dashed line
is the average ratio. It can be seen that the deterministic schedule is
always worse than the stochastic one. Figure 2 shows the trace of $P_k$
for the optimal deterministic fixed schedule, together with the trace
of $P_k$ from a sample path of the stochastic fixed schedule and the
$E P_k$ of the stochastic fixed schedule for one random experiment.




Fig. 1. Histogram of the ratio between $\mathrm{trace}(P_\infty)$ of the
deterministic schedule and $\mathrm{trace}(E P_\infty)$ of the stochastic schedule

Fig. 2. Evolution of $\mathrm{trace}(P_k)$



V. Conclusions 

In this paper, we proposed a stochastic sensor selection algorithm for wireless sensor networks with a tree topology. We solved the optimal stochastic sensor selection problem, after relaxation, by means of convex optimization. We also provided a possible implementation of our random sensor selection algorithm that does not introduce any communication overhead. Finally, we discussed extensions to general graphs and to the case of unreliable communications. Numerical examples illustrate the effectiveness of the proposed approach.

8 c models the fact that as the distance goes to zero the communication cost does not
Appendix
First, let us state the following Proposition: 
Proposition 3. Define functions $f(X)$, $h(X)$ to be

$$f(X) = X^{-1}, \tag{22}$$

$$h(X) = (A X^{-1} A' + Q)^{-1}, \tag{23}$$

where $X \in \mathbb{R}^{n\times n}$ is positive definite and $T \in \mathcal{T}$. Then the following statements hold:

1) $f(X)$ is convex and monotone decreasing;

2) $h(X)$ is concave and monotone increasing.
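The two statements of Proposition 3 can be checked numerically on random positive definite matrices. The following is only a sanity-check sketch, not part of the original argument: the dimension `n`, the random matrix `A`, the choice `Q = I`, and the helper names `random_pd` and `min_eig` are all assumptions made for illustration. Inequalities between matrices are tested through the smallest eigenvalue of the difference.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3                                   # illustrative dimension (assumption)
A = rng.standard_normal((n, n))         # illustrative system matrix (assumption)
Q = np.eye(n)                           # illustrative noise covariance (assumption)

def random_pd(n):
    """Random symmetric positive definite matrix."""
    M = rng.standard_normal((n, n))
    return M @ M.T + 0.1 * np.eye(n)

def f(X):
    return np.linalg.inv(X)

def h(X):
    return np.linalg.inv(A @ np.linalg.inv(X) @ A.T + Q)

def min_eig(M):
    """Smallest eigenvalue of the symmetric part of M."""
    return np.linalg.eigvalsh((M + M.T) / 2).min()

convex_f, concave_h, decreasing_f, increasing_h = [], [], [], []
for _ in range(200):
    X, Y = random_pd(n), random_pd(n)
    mid = (X + Y) / 2
    # Midpoint convexity of f: (f(X) + f(Y)) / 2 - f(mid) should be PSD.
    convex_f.append(min_eig((f(X) + f(Y)) / 2 - f(mid)))
    # Midpoint concavity of h: h(mid) - (h(X) + h(Y)) / 2 should be PSD.
    concave_h.append(min_eig(h(mid) - (h(X) + h(Y)) / 2))
    # Monotonicity: X + Y >= X in the positive semidefinite order.
    decreasing_f.append(min_eig(f(X) - f(X + Y)))
    increasing_h.append(min_eig(h(X + Y) - h(X)))

print(min(convex_f), min(concave_h), min(decreasing_f), min(increasing_h))
```

Each printed value should be nonnegative up to rounding error. Such a check cannot replace a proof, but it is a quick way to catch a sign or transpose mistake when implementing $f$ and $h$.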

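Proof of Theorem: By the definition of $L_k$, we know that

$$L_k^{-1} = L_{k|k-1}^{-1} + \sum_{i=1}^m p_{k,i}\frac{C_i' C_i}{r_i}, \qquad L_{k|k-1}^{-1} = (A L_{k-1} A' + Q)^{-1}.$$

Let us define $Z_k = P_k^{-1}$, $Z_{k|k-1} = P_{k|k-1}^{-1}$. We will first prove $L_k^{-1} \ge \mathbb{E}Z_k$ by induction. When $k = 0$, $L_0^{-1} = Z_0 = P_0^{-1}$. Suppose that $L_{k-1}^{-1} \ge \mathbb{E}Z_{k-1}$. Since $P_{k|k-1} = A P_{k-1} A' + Q$, we know that

$$Z_{k|k-1} = (A Z_{k-1}^{-1} A' + Q)^{-1} = h(Z_{k-1}). \tag{24}$$

By taking the expectation on both sides, we get

$$\mathbb{E}Z_{k|k-1} = \mathbb{E}h(Z_{k-1}) \le h(\mathbb{E}Z_{k-1}) \le h(L_{k-1}^{-1}) = L_{k|k-1}^{-1}. \tag{25}$$

The first inequality is a consequence of the concavity of $h(X)$ and Jensen's inequality. The second inequality is derived from the monotonicity of $h(X)$ and from the fact that $L_{k-1}^{-1} \ge \mathbb{E}Z_{k-1}$. Now, by (5), we know that

$$\mathbb{E}Z_k = \mathbb{E}Z_{k|k-1} + \sum_{i=1}^m p_{k,i}\frac{C_i' C_i}{r_i} \le L_{k|k-1}^{-1} + \sum_{i=1}^m p_{k,i}\frac{C_i' C_i}{r_i} = L_k^{-1}. \tag{26}$$

Hence, for all $k$, $L_k^{-1} \ge \mathbb{E}Z_k$. Now, by the definition of $Z_k$, we know that

$$P_k = Z_k^{-1} = f(Z_k).$$

Since $f$ is convex, by Jensen's inequality, the following inequalities result:

$$\mathbb{E}P_k = \mathbb{E}f(Z_k) \ge f(\mathbb{E}Z_k) = (\mathbb{E}Z_k)^{-1} \ge L_k, \tag{27}$$

where the last inequality follows from $L_k^{-1} \ge \mathbb{E}Z_k$ and the fact that $f$ is monotone decreasing.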
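The bound $\mathbb{E}P_k \ge L_k$ can be illustrated on a toy system small enough that the expectation is computed exactly by enumerating every selection pattern. The sketch below is an illustration under stated assumptions, not the authors' simulation: the two-state system, the two scalar sensors, the probabilities `p`, and the assumption that sensors are selected independently with probability $p_i$ at each step are all hypothetical (the paper selects subtrees, which only changes the joint distribution, not the marginals used in the proof).

```python
import itertools
import numpy as np

# Toy system; every number here is an illustrative assumption.
A = np.array([[1.1, 0.2], [0.0, 0.9]])
Q = 0.5 * np.eye(2)
C = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]   # C_i as row vectors
r = [1.0, 2.0]                                      # measurement noise variances r_i
p = [0.3, 0.6]                                      # selection probabilities p_i
P0 = np.eye(2)
K = 4                                               # number of time steps

def update(P_prev, weights):
    """Inverse of (A P A' + Q)^{-1} + sum_i weights_i C_i' C_i / r_i."""
    info = np.linalg.inv(A @ P_prev @ A.T + Q)
    for w, C_i, r_i in zip(weights, C, r):
        info = info + w * np.outer(C_i, C_i) / r_i
    return np.linalg.inv(info)

paths = [(1.0, P0)]      # (probability, covariance) for every selection history
L_k = P0                 # deterministic recursion, started from L_0 = P_0
gaps = []                # smallest eigenvalue of E[P_k] - L_k for k = 1..K
for k in range(1, K + 1):
    new_paths = []
    for prob, P in paths:
        for gamma in itertools.product([0, 1], repeat=len(p)):
            pr = prob
            for g, p_i in zip(gamma, p):
                pr *= p_i if g else 1.0 - p_i
            new_paths.append((pr, update(P, gamma)))
    paths = new_paths
    L_k = update(L_k, p)                      # selection indicators replaced by p_i
    EP_k = sum(pr * P for pr, P in paths)     # exact expectation over all histories
    gaps.append(np.linalg.eigvalsh(EP_k - L_k).min())

print(gaps)
```

Every entry of `gaps` should be nonnegative up to rounding, which is the statement of (27) for this toy example. The enumeration grows as $4^K$, so this approach is only practical for very short horizons.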



Proof of Theorem 3: Fix $X$. By definition,

$$L(X,p) = f\left((A X A^T + Q)^{-1} + \sum_{i=1}^m p_i\frac{C_i^T C_i}{r_i}\right).$$

Since $f$ is convex and $(A X A^T + Q)^{-1} + \sum_{i=1}^m p_i C_i^T C_i / r_i$ is affine with respect to $p$, $L$ is convex with respect to $p$. Once $p$ is fixed, it is easy to see that $L$ is of the same form as $h$. By similar arguments, $L$ is concave and monotone increasing with respect to $X$. ■
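The two curvature properties of $L(X,p)$ used here, convexity in $p$ for fixed $X$ and concavity in $X$ for fixed $p$, can be spot-checked numerically. The sketch below assumes scalar sensors (each $C_i$ is a row of a matrix `C`) and draws all system data at random; the sizes `n`, `m` and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 3, 4                              # illustrative sizes (assumption)
A = rng.standard_normal((n, n))
Q = np.eye(n)
C = rng.standard_normal((m, n))          # row i plays the role of C_i
r = rng.uniform(0.5, 2.0, size=m)        # noise variances r_i

def L_map(X, p):
    """L(X, p) = ((A X A^T + Q)^{-1} + sum_i p_i C_i^T C_i / r_i)^{-1}."""
    info = np.linalg.inv(A @ X @ A.T + Q) + (C.T * (p / r)) @ C
    return np.linalg.inv(info)

def random_psd(n):
    M = rng.standard_normal((n, n))
    return M @ M.T

def min_eig(M):
    return np.linalg.eigvalsh((M + M.T) / 2).min()

convex_in_p, concave_in_X = [], []
for _ in range(200):
    X, Y = random_psd(n), random_psd(n)
    p, q = rng.uniform(0, 1, m), rng.uniform(0, 1, m)
    # Midpoint convexity in p for fixed X.
    convex_in_p.append(
        min_eig((L_map(X, p) + L_map(X, q)) / 2 - L_map(X, (p + q) / 2)))
    # Midpoint concavity in X for fixed p.
    concave_in_X.append(
        min_eig(L_map((X + Y) / 2, p) - (L_map(X, p) + L_map(Y, p)) / 2))

print(min(convex_in_p), min(concave_in_X))
```

Both printed values should be nonnegative up to rounding error. Convexity in $p$ is what makes the relaxed sensor selection problem amenable to convex optimization.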

Before proving Theorem 4, we need the following lemmas:

Lemma 1. Consider the matrix $C_p = [\sqrt{p_1}\,C_1^T, \ldots, \sqrt{p_m}\,C_m^T]^T$. If the pair $(C_p, A)$ is detectable, then the following limit exists for all positive semidefinite matrices $X$:

$$L^{\infty}(X,p) = \lim_{k\to\infty} L^{(k)}(X,p).$$

Moreover, if the pair $(A, Q^{1/2})$ is controllable, then the above limit is unique regardless of $X$.



Proof: Let us build a linear system whose dynamics are given by

$$x_{k+1} = A x_k + w_k,$$
$$y_k = C_p x_k + v_k,$$

where $x_0 \sim \mathcal{N}(0, X)$, $w_k \sim \mathcal{N}(0, Q)$, $v_k \sim \mathcal{N}(0, R)$ and all of them are mutually independent. Consider now the covariance matrix of the Kalman filter for the above system, which is given by

$$P_0 = X, \tag{28}$$

$$P_{k+1|k} = A P_k A^T + Q, \tag{29}$$

$$P_{k+1} = \left(P_{k+1|k}^{-1} + C_p^T R^{-1} C_p\right)^{-1} = \left(P_{k+1|k}^{-1} + \sum_{i=1}^m p_i\frac{C_i^T C_i}{r_i}\right)^{-1}. \tag{30}$$

By construction, such a covariance matrix satisfies $P_k = L^{(k)}(X,p)$ and hence the limit $P_\infty = \lim_{k\to\infty} P_k$ exists if $(C_p, A)$ is detectable. Moreover, the limit is unique regardless of $P_0$ if $(A, Q^{1/2})$ is controllable. ■
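A quick way to see Lemma 1 at work is to iterate the map $X \mapsto L(X,p)$ from two very different initial conditions and compare the results. The sketch below uses a hypothetical two-state system with $C = I$, $p > 0$ and $Q > 0$, so that both the detectability and the controllability conditions hold; none of these numbers come from the paper.

```python
import numpy as np

# Toy data (assumptions): C = I with p > 0 makes (C_p, A) observable,
# and Q > 0 makes (A, Q^{1/2}) controllable.
A = np.array([[1.1, 0.2], [0.0, 0.9]])
Q = 0.5 * np.eye(2)
C = np.eye(2)                    # row i plays the role of C_i
r = np.array([1.0, 2.0])         # noise variances r_i
p = np.array([0.3, 0.6])         # selection probabilities p_i

def L_map(X):
    """One application of L(X, p) for the fixed p above."""
    info = np.linalg.inv(A @ X @ A.T + Q) + (C.T * (p / r)) @ C
    return np.linalg.inv(info)

def iterate(X, steps=200):
    """Compute L^{(steps)}(X, p) by repeated application of L_map."""
    for _ in range(steps):
        X = L_map(X)
    return X

L_from_zero = iterate(np.zeros((2, 2)))
L_from_large = iterate(100.0 * np.eye(2))

print(np.abs(L_from_zero - L_from_large).max())           # distance between limits
print(np.abs(L_map(L_from_zero) - L_from_zero).max())     # fixed-point residual
```

Both printed numbers should be at the level of rounding error: the two sequences reach the same matrix, and that matrix is a fixed point of $L(\cdot, p)$.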

Another result on the uniqueness of the limit can also be provided:

Lemma 2. Let $Q > 0$ be a strictly positive definite matrix. If there exists a fixed point $X_0$ satisfying

$$X_0 = L(X_0, p),$$

then $L^{\infty}(X, p)$ exists and moreover

$$L^{\infty}(X,p) = X_0, \quad \text{for all } X \text{ positive semidefinite}.$$

Proof: First, we want to show that $L(X, p)$ is strictly positive definite for any $X \ge 0$. By definition we have

$$L(X,p) = \left[(A X A^T + Q)^{-1} + \sum_{i=1}^m p_i\frac{C_i^T C_i}{r_i}\right]^{-1} \ge \left[Q^{-1} + \sum_{i=1}^m p_i\frac{C_i^T C_i}{r_i}\right]^{-1} > 0.$$

In particular, this implies that $X_0 > 0$. Now, because $L(X, p)$ is concave in $X$, we obtain:

$$\frac{1}{a}L(aX_0,p) < \frac{1}{a}L(aX_0,p) + \frac{a-1}{a}L(0,p) \le L(X_0,p) = X_0, \quad \forall a > 1.$$

As a result, $L(aX_0,p) < aX_0$ and, exploiting the monotonicity of $L(X,p)$, the following inequality holds:

$$0 < L^{(k+1)}(aX_0,p) \le L^{(k)}(aX_0,p).$$

Then $L^{(k)}(aX_0,p)$ is bounded regardless of $k$. Because $X_0 > 0$, for any $X$ positive semidefinite there exists a scalar $a_X > 1$ such that $X \le a_X X_0$. Then, using again the monotonicity of $L(X,p)$, one can prove that $L^{(k)}(X, p) \le L^{(k)}(a_X X_0,p)$ is also bounded regardless of $k$. Hence, the pair $(C_p, A)$ must be detectable, which implies that $L^{\infty}(X, p)$ exists for all $X$. Moreover, since $Q > 0$, the limit is unique and it must be $X_0$. ■
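The key steps of this proof, namely $L(aX_0,p) < aX_0$ for $a > 1$ and the resulting monotone sequence $L^{(k)}(aX_0,p)$, can be traced numerically. The following sketch is illustrative only: the system matrices, the probabilities, and the scaling factor `a = 3` are assumptions, and the fixed point $X_0$ is itself approximated by iterating the map.

```python
import numpy as np

# Toy data (assumptions), with Q > 0 as required by the lemma.
A = np.array([[1.1, 0.2], [0.0, 0.9]])
Q = 0.5 * np.eye(2)
C = np.eye(2)                    # row i plays the role of C_i
r = np.array([1.0, 2.0])
p = np.array([0.3, 0.6])

def L_map(X):
    info = np.linalg.inv(A @ X @ A.T + Q) + (C.T * (p / r)) @ C
    return np.linalg.inv(info)

def min_eig(M):
    return np.linalg.eigvalsh((M + M.T) / 2).min()

# Approximate the fixed point X0 = L(X0, p) by iterating from the identity.
X0 = np.eye(2)
for _ in range(500):
    X0 = L_map(X0)

a = 3.0                                          # any scalar a > 1 (assumption)
first_gap = min_eig(a * X0 - L_map(a * X0))      # should be strictly positive

# The sequence L^{(k)}(a X0, p) should be non-increasing and converge to X0.
X = a * X0
step_gaps = []
for _ in range(100):
    X_next = L_map(X)
    step_gaps.append(min_eig(X - X_next))
    X = X_next

print(first_gap, min(step_gaps), np.abs(X - X0).max())
```

A positive `first_gap` corresponds to $L(aX_0,p) < aX_0$, nonnegative `step_gaps` (up to rounding) correspond to the monotone sequence, and the last printed value shows the sequence returning to $X_0$.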

Now we are ready to prove Theorem 4.
Proof:

1) It is easy to check that $C_{p_0} = \sqrt{\mathcal{E}_d / \sum_{i=1}^m c_i}\; C$ and $p_0 \in \mathcal{P}$. Since $(C, A)$ is detectable, $\left(\sqrt{\mathcal{E}_d / \sum_{i=1}^m c_i}\; C, A\right)$ is also detectable and then $L_0$ exists.

5) By the definition of accumulation point, there is a subsequence $p_{i_1}, p_{i_2}, \ldots$ which converges to $p^*$. For each index $i_k$ we have

$$L(L_{i_k - 1}, p_{i_k}) = L_{i_k}.$$

If we take the limit on both sides and exploit the fact that $L(X, p)$ is continuous, we obtain

$$L(L_\infty, p^*) = L_\infty,$$

and finally by Lemma 2 the limit is unique.



